Determining Semantic Textual Similarity using Natural Deduction Proofs
Abstract
Determining semantic textual similarity is a core research subject in natural language processing. Since vector-based models for sentence representation often use shallow information, capturing accurate semantics is difficult. By contrast, logical semantic representations capture deeper levels of sentence semantics, but their symbolic nature does not offer graded notions of textual similarity. We propose a method for determining semantic textual similarity by combining shallow features with features extracted from natural deduction proofs of bidirectional entailment relations between sentence pairs. For the natural deduction proofs, we use ccg2lambda, a higher-order automatic inference system, which converts Combinatory Categorial Grammar (CCG) derivation trees into semantic representations and conducts natural deduction proofs. Experiments show that our system was able to outperform other logic-based systems and that features derived from the proofs are effective for learning textual similarity.
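The idea of combining shallow features with proof-derived features can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the abstract does not list the actual features, so `ProofResult`, its fields, and the two shallow features (word overlap and length difference) are hypothetical stand-ins, and the proof results are supplied by hand rather than produced by ccg2lambda. The sketch only shows how features from both entailment directions might be concatenated into one vector for a downstream regressor.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProofResult:
    """Hypothetical summary of one proof attempt (not ccg2lambda's real output)."""
    proved: bool            # whether the prover found a proof
    steps: int              # number of inference steps taken
    unproved_subgoals: int  # subgoals still open when the attempt ended


def jaccard(a: str, b: str) -> float:
    """Shallow feature: word-overlap ratio between two sentences."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa and not wb:
        return 1.0
    return len(wa & wb) / len(wa | wb)


def feature_vector(s1: str, s2: str,
                   forward: ProofResult,
                   backward: ProofResult) -> List[float]:
    """Concatenate shallow features with features from both proof directions.

    `forward` is assumed to summarize the attempt to prove s1 => s2,
    `backward` the attempt to prove s2 => s1.
    """
    feats = [
        jaccard(s1, s2),
        float(abs(len(s1.split()) - len(s2.split()))),
    ]
    for proof in (forward, backward):
        feats += [
            float(proof.proved),
            float(proof.steps),
            float(proof.unproved_subgoals),
        ]
    return feats


# Hand-written proof summaries, purely illustrative.
s1 = "A man is playing a guitar"
s2 = "A man is playing an instrument"
vec = feature_vector(
    s1, s2,
    forward=ProofResult(proved=True, steps=5, unproved_subgoals=0),
    backward=ProofResult(proved=False, steps=7, unproved_subgoals=1),
)
```

A vector like `vec` would then be passed to any supervised regressor trained on human similarity scores. Keeping the two proof directions as separate feature groups lets the learner distinguish one-way entailment from mutual entailment.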
Similar resources
The Meaning Factory: Formal Semantics for Recognizing Textual Entailment and Determining Semantic Similarity
Shared Task 1 of SemEval-2014 comprised two subtasks on the same dataset of sentence pairs: recognizing textual entailment and determining textual similarity. We used an existing system based on formal semantics and logical inference to participate in the first subtask, reaching an accuracy of 82%, ranking in the top 5 of more than twenty participating systems. For determining semantic similari...
A Logic Prover Approach to Predicting Textual Similarity
This paper presents a logic prover approach to predicting textual similarity. Sentences are represented using three logic forms capturing different levels of knowledge, from only content words to semantic representations extracted with an existing semantic parser. A logic prover is used to find proofs and derive semantic features that are combined in a machine learning framework. Experimental r...
Measuring Semantic Similarity for Bengali Tweets Using WordNet
Similarity between natural language texts, sentences in terms of meaning, known as textual entailment, is a generic problem in the area of computational linguistics. In the last few years researchers worked on various aspects of textual entailment problem, but mostly restricted to English language. Here in this paper we present a method for measuring the semantic similarity of Bengali tweets us...
Code Similarity via Natural Language Descriptions
Code similarity is a central challenge in many programming related applications, such as code search, automatic translation, and plagiarism detection. In this work, we reduce the problem of semantic relatedness between code fragments into a problem of semantic relatedness of textual descriptions. Our main idea is that we can use the relationship between code and its textual descriptions as esta...
Probabilistic Soft Logic for Semantic Textual Similarity
Probabilistic Soft Logic (PSL) is a recently developed framework for probabilistic logic. We use PSL to combine logical and distributional representations of natural-language meaning, where distributional information is represented in the form of weighted inference rules. We apply this framework to the task of Semantic Textual Similarity (STS) (i.e. judging the semantic similarity of naturallan...